Human-like Forgetting Curves in Deep Neural Networks
This study bridges cognitive science and neural network design by examining whether artificial models exhibit human-like forgetting curves. Drawing upon Ebbinghaus' seminal work on memory decay and principles of spaced repetition, we propose a quantitative framework to measure information retention in neural networks. Our approach computes the recall probability by evaluating the similarity between a network's current hidden state and previously stored prototype representations. This retention metric facilitates the scheduling of review sessions, thereby mitigating catastrophic forgetting during deployment and enhancing training efficiency by prompting targeted reviews. Our experiments with Multi-Layer Perceptrons reveal human-like forgetting curves, with knowledge becoming increasingly robust through scheduled reviews. This alignment between neural network forgetting curves and established human memory models identifies neural networks as an architecture that naturally emulates human memory decay and can inform state-of-the-art continual learning algorithms.
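The retention metric described above can be sketched as follows, assuming cosine similarity between hidden states and prototypes, and an Ebbinghaus-style exponential decay R(t) = exp(-t/S); function names and the decay form are illustrative assumptions, not the paper's actual implementation.

```python
import math

def cosine_similarity(a, b):
    # similarity between the current hidden state and a stored prototype
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recall_probability(hidden_state, prototype, elapsed_steps, strength):
    """Similarity-based recall estimate, attenuated by exponential memory decay."""
    sim = max(0.0, cosine_similarity(hidden_state, prototype))
    decay = math.exp(-elapsed_steps / strength)  # Ebbinghaus curve R(t) = exp(-t/S)
    return sim * decay

def schedule_review(p_recall, threshold=0.5):
    """Trigger a review session when estimated recall drops below a threshold."""
    return p_recall < threshold
```

A scheduled review would then raise `strength`, flattening the decay curve, which matches the abstract's observation that knowledge becomes increasingly robust through repeated reviews.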
- Education (0.68)
- Health & Medicine > Therapeutic Area (0.47)
Fine-Grained Gradient Restriction: A Simple Approach for Mitigating Catastrophic Forgetting
Liu, Bo, Ye, Mao, Stone, Peter, Liu, Qiang
A fundamental challenge in continual learning is to balance the trade-off between learning new tasks and remembering previously acquired knowledge. Gradient Episodic Memory (GEM) achieves this balance by utilizing a subset of past training samples to restrict the update direction of the model parameters. In this work, we start by analyzing an often overlooked hyper-parameter in GEM, the memory strength, which boosts empirical performance by further constraining the update direction. We show that memory strength is effective mainly because it improves GEM's generalization ability and therefore leads to a more favorable trade-off. Motivated by this finding, we propose two approaches that constrain the update direction more flexibly. Our methods achieve uniformly better Pareto frontiers of remembering old and learning new knowledge than using memory strength. We further propose a computationally efficient method to approximately solve the optimization problem with more constraints.
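In the single-constraint case, GEM's projection with a memory-strength margin can be sketched as below: if the proposed gradient conflicts with the reference gradient computed on stored samples (inner product below a margin gamma), project it onto the half-space where the constraint holds. This is a hedged illustration of the basic mechanism; the paper's proposed methods use multiple, more flexible constraints.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gem_project(g, g_ref, gamma=0.0):
    """Project gradient g so that <g_proj, g_ref> >= gamma (memory strength margin)."""
    inner = dot(g, g_ref)
    if inner >= gamma:
        return list(g)  # no conflict with the memory gradient: keep g unchanged
    # closed-form projection onto the half-space {x : <x, g_ref> >= gamma}
    scale = (inner - gamma) / dot(g_ref, g_ref)
    return [x - scale * r for x, r in zip(g, g_ref)]
```

With `gamma = 0` this reduces to plain GEM's non-interference constraint; a positive `gamma` constrains the update direction further, which is the memory-strength effect the abstract analyzes.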
Investigating Context-Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style
Li, Yuepei, Zhou, Kang, Qiao, Qiao, Nguyen, Bach, Wang, Qing, Li, Qi
Retrieval-augmented generation (RAG) improves Large Language Models (LLMs) by incorporating external information into the response generation process. However, how context-faithful LLMs are, and what factors influence their context-faithfulness, remain largely unexplored. In this study, we investigate the impact of memory strength and evidence presentation on LLMs' receptiveness to external evidence. We introduce a method to quantify the memory strength of LLMs by measuring the divergence in their responses to different paraphrases of the same question, a factor not considered by previous work. We also generate evidence in various styles to evaluate their effects. Two datasets are used for evaluation: Natural Questions (NQ), featuring popular questions, and PopQA, featuring long-tail questions. Our results show that for questions with high memory strength, LLMs are more likely to rely on internal memory, particularly for larger LLMs such as GPT-4. On the other hand, presenting paraphrased evidence significantly increases LLMs' receptiveness compared to simple repetition or adding details.
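The paraphrase-divergence idea can be illustrated with a minimal proxy, assuming string-level answers: pairwise exact-match agreement among an LLM's answers to paraphrases of the same question (1.0 = perfectly consistent answers = strong memory). The paper's actual divergence measure may differ; this only sketches the intuition.

```python
from itertools import combinations

def memory_strength(answers):
    """Fraction of paraphrase-answer pairs that agree (after light normalization)."""
    norm = [a.strip().lower() for a in answers]
    pairs = list(combinations(norm, 2))
    if not pairs:
        return 1.0  # a single answer gives no divergence signal
    agree = sum(a == b for a, b in pairs)
    return agree / len(pairs)
```

A question answered identically across paraphrases scores 1.0 (high memory strength, so the model would likely resist contradicting evidence), while inconsistent answers score near 0.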
Benchmarking Sensitivity of Continual Graph Learning for Skeleton-Based Action Recognition
Wei, Wei, De Schepper, Tom, Mets, Kevin
Continual learning (CL) is the research field that aims to build machine learning models that can accumulate knowledge continuously over different tasks without retraining from scratch. Previous studies have shown that pre-training graph neural networks (GNN) may lead to negative transfer (Hu et al., 2020) after fine-tuning, a setting which is closely related to CL. Thus, we focus on studying GNN in the continual graph learning (CGL) setting. We propose the first continual graph learning benchmark for spatio-temporal graphs and use it to benchmark well-known CGL methods in this novel setting. The benchmark is based on the N-UCLA and NTU-RGB+D datasets for skeleton-based action recognition. Beyond benchmarking for standard performance metrics, we study the class and task-order sensitivity of CGL methods, i.e., the impact of learning order on each class/task's performance, and the architectural sensitivity of CGL methods with backbone GNN at various widths and depths. We reveal that task-order robust methods can still be class-order sensitive and observe results that contradict previous empirical observations on architectural sensitivity in CL.
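The class-order sensitivity studied above can be summarized by a simple statistic: for each class, the spread of its final accuracy across different learning orders. The data layout below (a mapping from order to per-class accuracies) is an assumption for illustration, not the benchmark's actual API.

```python
import statistics

def class_order_sensitivity(results):
    """results: {order: {class_id: accuracy}} -> {class_id: stdev across orders}.

    A large spread for a class means its performance depends heavily on when
    it was learned, i.e., the method is class-order sensitive for that class.
    """
    classes = next(iter(results.values())).keys()
    return {
        c: statistics.pstdev(res[c] for res in results.values())
        for c in classes
    }
```

A method can show near-zero spread when accuracies are aggregated per task yet large spread per class, which is how a task-order robust method can still be class-order sensitive.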
Neural Storage: A New Paradigm of Elastic Memory
Chakraborty, Prabuddha, Bhunia, Swarup
Storage and retrieval of data in a computer memory play a major role in system performance. Traditionally, computer memory organization is static - i.e., it does not change based on application-specific characteristics of memory access behaviour during system operation. Specifically, neither the association of a data block with a search pattern (or cues) nor the granularity of stored data evolves. Such a static nature of computer memory, we observe, not only limits the amount of data we can store in a given physical storage, but also misses the opportunity for dramatic performance improvement in various applications. On the contrary, human memory is characterized by seemingly infinite plasticity in storing and retrieving data - as well as in dynamically creating/updating the associations between data and corresponding cues. In this paper, we introduce Neural Storage (NS), a brain-inspired learning memory paradigm that organizes the memory as a flexible neural memory network. In NS, the network structure, strength of associations, and granularity of the data adjust continuously during system operation, providing unprecedented plasticity and performance benefits. We present the associated storage/retrieval/retention algorithms in NS, which integrate a formalized learning process. Using a full-blown operational model, we demonstrate that NS achieves an order of magnitude improvement in memory access performance for two representative applications when compared to traditional content-based memory.
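The cue-data association idea can be sketched as a toy associative store: data is stored under multiple cues, retrieval follows the strongest matching association, and associations are reinforced when used. The actual NS network structure and learning algorithms are far richer; class and method names here are illustrative.

```python
from collections import defaultdict

class AssociativeStore:
    """Toy sketch: dynamic cue-to-data associations with adjustable strengths."""

    def __init__(self):
        self.assoc = defaultdict(dict)  # cue -> {data: association strength}

    def store(self, data, cues, strength=1.0):
        # associate one data item with several cues, accumulating strength
        for cue in cues:
            self.assoc[cue][data] = self.assoc[cue].get(data, 0.0) + strength

    def retrieve(self, cue):
        candidates = self.assoc.get(cue)
        if not candidates:
            return None
        best = max(candidates, key=candidates.get)
        candidates[best] += 0.5  # reinforce the association that was just used
        return best
```

Unlike a static content-addressed memory, the cue-to-data mapping here changes during operation, which is the plasticity property the abstract contrasts with conventional memory organization.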
A Stabilized Feedback Episodic Memory (SF-EM) and Home Service Provision Framework for Robot and IoT Collaboration
The automated home, referred to as a Smart Home, is expected to offer fully customized services to its residents, reducing household labor and thus improving residents' welfare. Service robots and the Internet of Things (IoT) play key roles in the development of the Smart Home. Service provision with these two main components in a Smart Home environment requires: 1) learning and reasoning algorithms and 2) the integration of robot and IoT systems. Conventional computational intelligence-based learning and reasoning algorithms do not successfully manage dynamic changes in Smart Home data, and simple integrations fail to fully draw out the synergies from the collaboration of the two systems. To tackle these limitations, we propose: 1) a stabilized memory network with a feedback mechanism which can learn user behaviors in an incremental manner and 2) a robot-IoT service provision framework for a Smart Home which utilizes the proposed memory architecture as a learning and reasoning module and exploits synergies between the robot and IoT systems. We conduct a set of comprehensive experiments under various conditions to verify the performance of the proposed memory architecture and the service provision framework, and analyze the experimental results.
Mindful Active Learning
Ashari, Zhila Esna, Ghasemzadeh, Hassan
We propose a novel active learning framework for activity recognition using wearable sensors. Our work is unique in that it takes the physical and cognitive limitations of the oracle into account when selecting sensor data to be annotated. Our approach is inspired by human beings' limited capacity to respond to external stimuli, such as a prompt on their mobile devices. This capacity constraint is manifested not only in the number of queries that a person can respond to in a given time-frame but also in the lag between the time a query is made and when it is responded to. We introduce the notion of mindful active learning and propose a computational framework, called EMMA, to maximize active learning performance while taking informativeness of sensor data, query budget, and human memory into account. We formulate this optimization problem, propose an approach to model memory retention, discuss the complexity of the problem, and propose a greedy heuristic to solve it. We demonstrate the effectiveness of our approach on three publicly available datasets, simulating oracles with various memory strengths. We show that activity recognition accuracy ranges from 21% to 97% depending on memory strength, query budget, and difficulty of the machine learning task. Our results also indicate that EMMA achieves an accuracy level that is, on average, 13.5% higher than when only informativeness of the sensor data is considered for active learning. Additionally, we show that the performance of our approach is at most 20% below an experimental upper-bound and up to 80% above an experimental lower-bound. We observe that mindful active learning is most beneficial when the query budget is small and/or the oracle's memory is weak, emphasizing the contributions of our work in human-centered mobile health settings and for the elderly with cognitive impairments.
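A retention-aware greedy selection of the kind described above can be sketched as follows: score each candidate query by its informativeness weighted by the oracle's estimated probability of still remembering the context at response time, using an Ebbinghaus-style decay exp(-lag/strength). This is a hedged illustration under those modeling assumptions, not EMMA's actual formulation.

```python
import math

def retention(lag, memory_strength):
    """Estimated probability the oracle still remembers the queried context."""
    return math.exp(-lag / memory_strength)

def greedy_select(candidates, budget, memory_strength):
    """Greedily pick up to `budget` queries maximizing retention-weighted
    informativeness. candidates: list of (id, informativeness, expected_lag)."""
    scored = [
        (info * retention(lag, memory_strength), cid)
        for cid, info, lag in candidates
    ]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:budget]]
```

Note how a highly informative sample can lose to a less informative one when the expected response lag is long and the oracle's memory is weak - the trade-off that makes the approach most useful at small budgets and low memory strengths.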